A high speed transcription interface for annotating primary linguistic data
Authors
Abstract
We present a new transcription mode for the annotation tool ELAN. This mode is designed to speed up the process of creating transcriptions of primary linguistic data (video and/or audio recordings of linguistic behaviour). We survey the basic transcription workflow of some commonly used tools (Transcriber, BlitzScribe, and ELAN) and describe how the new transcription interface improves on these existing implementations. We describe the design of the transcription interface and explore some further possibilities for improvement in the areas of segmentation and computational enrichment of annotations.
Similar resources
Annotating Syllable Corpora with Linguistic Data Categories in XML
The usefulness of high-quality annotated corpora as a development aid in computational linguistic applications is now well understood. It is therefore necessary to have systematic, easily understandable, and effective means for annotating corpora at many levels of linguistic description. This paper presents a three-step methodology for annotating speech corpora using linguistic data catego...
The Annotation Graph Toolkit: Software Components for Building Linguistic Annotation Tools
Annotation graphs provide an efficient and expressive data model for linguistic annotations of time-series data. This paper reports progress on a complete software infrastructure supporting the rapid development of tools for transcribing and annotating time-series data. This general-purpose infrastructure uses annotation graphs as the underlying model, and allows developers to quickly create sp...
Tools for hierarchical annotation of typed dialogue
We discuss a set of tools for annotating a complex hierarchical and linguistic structure of tutorial dialogue based on the NITE XML Toolkit (NXT) (Carletta et al., 2003). The NXT API supports multi-layered stand-off data annotation and synchronisation with timed and speech data. Using NXT, we built a set of extensible tools for detailed structure annotation of typed tutorial dialogue, collected...
Exploring Lexical Patterns in Text: Lexical Cohesion Analysis with WordNet
We present a system for the linguistic exploration and analysis of lexical cohesion in English texts. Using an electronic thesaurus-like resource, Princeton WordNet, and the Brown Corpus of English, we have implemented a process of annotating text with lexical chains and a graphical user interface for inspection of the annotated text. We describe the system and report on some sample linguistic ...
A preconditioned solver for sharp resolution of multiphase flows at all Mach numbers
A preconditioned five-equation two-phase model coupled with an interface sharpening technique is introduced for simulation of a wide range of multiphase flows in both high and low Mach regimes. The Harten-Lax-van Leer-Contact (HLLC) Riemann solver is implemented for solving the discretized equations, while the tangent of hyperbola for interface capturing (THINC) interface sharpening method is applied ...